Bayesian Hierarchical Modeling of
Cell-Type-Specific Gene Regulation

Michael Love,
Assistant Professor,
Dept of Biostatistics
Dept of Genetics

Biological aim: cell-type-specific regulation

In particular, effect of common genetic variation on …

Mouse brain, dentate gyrus. Livet, Weissman, Sanes and Lichtman/Harvard University

Bayesian Hierarchical Models

Let’s demystify — what do we plan to do


  • A parametric model for regulation
  • Parameters will represent an abstract relationship
  • E.g. SNP \(\rightarrow\) expression, or SNP \(\rightarrow\) chromatin accessibility
  • A hierarchy to compare these across cell types
  • Specify distributions relating parameters to e/o
  • Bayes theory and comp. frameworks to make inference

What is a likelihood?

\[ y_i \sim N(\mu, 1) \]

Bayesian model

What is a posterior? Likelihood \(\times\) some other distribution

What is a Bayesian Hierarchical Model?

  • Like before, Bayesian model for data \(y_i\)
  • But there is some structure in the data
  • E.g. data on schools, schools are in grouped into regions
  • Multiple \(\mu\) to estimate (say, one per school)

A simple hierarchical model to start



\[ \begin{align} y_i &\sim N(\mu_i, \sigma^2=1) \\ \mu_i &\sim N(0, \sigma^2=A) \\ \end{align} \]

\[ \hat{\mu}_i^{Bayes} = \left(1 - \frac{1}{A+1} \right) y_i \]

“Stein’s Estimation Rule and Its Competitors - An Empirical Bayes Approach.” Bradley Efron and Carl Morris,
Journal of the American Statistical Association, Vol. 68, No. 341 (Mar., 1973), pp. 117-130 (14 pages) DOI: 10.2307/2284155

Let’s plug in some numbers

Let’s try A = 1 (SD = 1) and A = 9 (SD = 3)

What if A is not know?


\[ \hat{\mu}_i^{James-Stein} = \left(1 - \frac{1}{Z} \right) y_i \]

Empirical Bayes = estimate distribution of \(\mu_i\) using \(y_i\)

where \(Z = \frac{1}{n-2} \sum_{i=1}^n y_i^2\)

Roughly: estimating A + 1 = observed variance

This makes sense: \(y_i = \mu_i + \varepsilon_i\)

Let’s try this out

\[ \begin{align} \hat{\mu}_i^{MLE} &= y_i \\ \hat{\mu}_i^{naive} &= 0 \end{align} \]

A more complex model

Schools (\(\mu_i\)) clustered in regions (\(\theta_k\)).

\(k(i)\) gives the region for school \(i\).

\[ \begin{align} y_i &\sim N(\mu_i, 1) \\ \mu_i &\sim N(\theta_{k(i)}, 1) \\ \theta_k &\sim N(0, 5) \end{align} \]

  • Not so simple to compute, switch to using a tool called Stan
  • Many frameworks for performing MCMC where one specifies the data, parameters, and model
  • Then “samples” (of parameters) from the posterior
  • Stan is very fast.

How does this look in Stan?


\[ \begin{align} y_i &\sim N(\mu_i, 1) \\ \mu_i &\sim N(\theta_{k(i)}, 1) \\ \theta_k &\sim N(0, 5) \end{align} \]

The full model in Stan

Model output (n=40, k=4)

This model samples very fast in Stan (hundreds of a second). We can look at posterior distributions:

Compared to truth

Look at the school level (\(\mu_i\)):

Did we do better than MLE?

Did we do better than MLE?

Look at the school level (\(\mu_i\)):

When will Bayes \(>\) a simpler per-region estimator?

James-Stein shrinks by \(\approx \left(1 - \frac{V_y}{V_\mu + V_y} \right)\)

When are Bayesian models useful?


  • Bayes estimators converge to MLE as \(n \to \infty\), so small \(n\)
  • Or, small \(n\), higher variance for some groups
  • E.g. some regions having only a few schools
  • Or, sharing information on mid-level parameters, e.g. \(V_\mu\)

BHM in genomics

  • limma, edgeR, DESeq2, apeglm
  • These are empirical Bayes, don’t involve MCMC

BHM in eQTL and GWAS

  • eCAVIAR - estimated coefficients ~ multivariate normal
  • Zhang, …, Chatterjee, Estimation of complex effect-size distributions using summary-level statistics Nat Gen (2018)


Modeling TF binding across cell types

  • Love et al, Role of the chromatin landscape and sequence in determining … binding NAR (2017)

Modeling TF binding across cell types




BHM applied to neuronal assays in Stein lab


  • MR Locus
    • asks: \(\uparrow\) eQTL, \(\uparrow\) GWAS?
    • mixture modeling of all effects, no thresholding
    • uncertainty on the effect sizes
  • pathQTL
    • cell-type-specific mediation across multiple “layers”
    • as in the TF binding case, focus on cell-type diffs
    • incorporate SNR into the mediation